Universal SIMD-Mathlibrary
Abstract
Standard functions for single-precision floating-point vector datatypes are provided for the SIMD platforms x86 (SSE2), PowerPC and Cell. In most cases, speed and/or accuracy compare favourably with existing SIMD libraries (MacOS Accelerate Framework, Cell SDK). Most of the algorithms are based on those of the Cephes library, while the implementation is branch-free and parallelized to minimize pipeline stalls. The Universal SIMD Mathlibrary (usm) provides the functions sin, cos, tan, asin, acos, atan, atan2, sqrt, exp, log, pow, abs, ceil, floor, ldexp, and frexp. It is licensed under the GPL3.
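As a rough illustration of what "branch-free" means for vector math, the sketch below emulates a 4-lane single-precision register in portable C and builds abs, a less-than comparison, and a mask-based select without any data-dependent branch. It is not usm's code: the type `vec4f` and the functions `vec4f_abs`, `vec4f_cmplt`, and `vec4f_select` are hypothetical names chosen for this example. On real SIMD hardware each per-lane loop would typically collapse into a single vector instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-lane vector type standing in for a SIMD register. */
typedef struct { float lane[4]; } vec4f;

/* Reinterpret float bits as an integer and back (assumes 32-bit IEEE-754 floats). */
static uint32_t f2u(float f) {
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &f, sizeof u);
    return u;
}
static float u2f(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* abs: clear the sign bit in every lane; no comparison is needed. */
static vec4f vec4f_abs(vec4f x) {
    vec4f r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = u2f(f2u(x.lane[i]) & 0x7FFFFFFFu);
    return r;
}

/* Compare: each mask lane becomes all ones where a < b, all zeros otherwise. */
static void vec4f_cmplt(vec4f a, vec4f b, uint32_t mask[4]) {
    for (int i = 0; i < 4; ++i)
        mask[i] = (uint32_t)0 - (uint32_t)(a.lane[i] < b.lane[i]);
}

/* Select: take a where the mask is set and b elsewhere, using only bitwise ops. */
static vec4f vec4f_select(const uint32_t mask[4], vec4f a, vec4f b) {
    vec4f r;
    for (int i = 0; i < 4; ++i) {
        uint32_t m = mask[i];
        r.lane[i] = u2f((f2u(a.lane[i]) & m) | (f2u(b.lane[i]) & ~m));
    }
    return r;
}
```

Both outcomes of a condition are computed for every lane and the mask picks the result, so all lanes follow the same instruction stream. For example, `vec4f_cmplt(x, zero, m)` followed by `vec4f_select(m, zero, x)` replaces negative lanes of `x` with zero.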
Similar resources
UMAC: Fast and Secure Message Authentication
We describe a message authentication algorithm, UMAC, which can authenticate messages (in software, on contemporary machines) roughly an order of magnitude faster than current practice (e.g., HMAC-SHA1), and about twice as fast as times previously reported for the universal hash-function family MMH. To achieve such speeds, UMAC uses a new universal hash-function family, NH, and a design which a...
Regular and almost universal hashing: an efficient implementation
Random hashing can provide guarantees regarding the performance of data structures such as hash tables— even in an adversarial setting. Many existing families of hash functions are universal: given two data objects, the probability that they have the same hash value is low given that we pick hash functions at random. However, universality fails to ensure that all hash functions are well behaved...
Modeling Universal Instruction Selection
Instruction selection implements a program under compilation by selecting processor instructions and has tremendous impact on the performance of the code generated by a compiler. This paper introduces a graph-based universal representation that unifies data and control flow for both programs and processor instructions. The representation is the essential prerequisite for a constraint model for ...
A Programmable, Scalable-Throughput Interleaver
The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleav...
Concurrent Processing Memory
A theoretical memory with limited processing power and internal connectivity at each element is proposed. This memory carries out parallel processing within itself to solve generic array problems. The applicability of this in-memory finest-grain massive SIMD approach is studied in some details. For an array of N items, it reduces the total instruction cycle count of universal operations such as...